AD-A266 174


26 May 1993


2/15/92  to  2/14/93 




Reference  Frames  in  Vision 


Prof.  Mary  M.  Hayhoe 




Center  For  Visual  Science 
College  of  Arts  And  Science 
University  of  Rochester,  NY  14627 




AFOSR  No  91-0332 



AFOSR/NL 

110  Duncan  Avenue,  Suite  B 1 1 5 
Bolling  AFB  DC  20332-0001 

Lt. Col. Collins

Summary  of  an  Annual  Progress  Report 


Approved  for  public  release; 
distribution  unlimited 


93-14030 


This research examines the consequences of observer motion for visual functioning.
Two major visual issues are addressed. The first issue is how a grossly time-varying
retinal input (due to eye, head, and body motion) results in the perception
of a continuous and directionally stable world. The second issue concerns how the
visual information in successive views is related, and what is retained from
previous viewing. The role of 'deictic primitives' (e.g., fixation points) and their
importance for accurate internal representations is being investigated by varying the
temporal access to the sensory input during the problem solving process.


Preliminary observations reveal some notable features of the eye and head coordination:
1) Head movement frequently leads the gaze change and 2) The fraction of gaze
shift due to head movement varied from 20% for short, vertical movements, to nearly
100% for large horizontal movements. It was in general dependent on the sub-task,
and was larger for horizontal than vertical gaze changes. In addition, it was
shown that the gaze moves first to the model area and is then refined down to the
workspace, while the head simply moves to the model area.


deictic  primitives,  robotic  models,  observer  motion,  retinal 
input,  orientation,  saccade-contingent  display  update 


UNCLASSIFIED 



UNCLASSIFIED  UNCLASSIFIED  UNLIMITED




COLLEGE OF ARTS AND SCIENCE

UNIVERSITY OF ROCHESTER

May  26,  1993 

Lt.  Col.  Daniel  Collins 
AFOSR/NL 

Bolling  Air  Force  Base 
Washington,  DC  20332-6448 

Dear  Lt.  Col.  Collins: 

Annual  Technical  Report:  AFOSR  No  91-0332.  Reference  Frames  in  Vision 

Period  2/15/92-2/14/93 

The  goal  of  this  project  is  to  examine  the  consequences  of  observer  motion  for  visual 
function.  The  research  has  focused  on  two  issues:  One  issue  is  how  a  grossly  time-varying 
retinal  input  (because  of  eye,  head,  and  body  motion)  results  in  the  perception  of  a  continuous 
and  directionally  stable  visual  world.  A  second  issue  concerns  how  the  information  in 
successive  views  is  related,  and  the  nature  of  the  visual  information  retained  from  previous 
views.  Understanding  these  processes  is  important  for  a  wide  variety  of  visuo-motor  tasks. 

Three graduate students, two undergraduate lab assistants, and a postdoctoral fellow
have participated in the project. The students are Jeff Pelz, Keith Kam, and Joel Lachter. The
postdoc is Steve Whitehead. Whitehead moved to another position at GTE Labs in August, but
has continued to collaborate on the project. Lachter is currently writing up his thesis and should
graduate  shortly.  They  are  supported  by  a  combination  of  funds  from  this  grant,  an  NIH 
training  grant,  and  University  funds. 

My  primary  effort  has  been  on  two  projects  with  Dana  Ballard,  Jeff  Pelz  and  Steve 
Whitehead,  on  performance  of  complex  tasks  involving  hand-eye  coordination.  In  the  earliest 
version  of  the  task  we  presented  stimuli  on  a  Mac  screen  and  monitored  eye  and  cursor 
position.  In  the  recently  developed  version  of  the  task  we  use  real  blocks  and  monitor  eye,  head, 
and  hand  position. 

Copying  Task  Using  Macintosh  Display. 

In  the  task  we  have  chosen,  subjects  copy  a  pattern  of  colored  blocks  on  a  computer 
screen  using  the  mouse  to  move  blocks  around  the  display.  Recent  successful  robotic  models  of 
complex  tasks  avoid  computationally  expensive  internal  representations  by  allowing  frequent 
access to the sensory input during the problem solving process. These models use so-called
'deictic  primitives'  in  which  aspects  of  a  scene  can  be  referred  to  by  denoting  that  part  of  the 
scene  with  a  special  marker,  such  as  the  fixation  point.  We  have  little  knowledge  of  how 
humans  actually  perform  in  comparable  sensori-motor  tasks.  We  have  shown  so  far  that 
human  performance  is  also  characterized  by  deictic  strategies  and  limited  memory 
representations.  This  suggests  that  current  approaches  in  robotics  are  also  useful  for 
understanding  human  brain  mechanisms.  It  also  suggests  a  computational  rationale  for  the 
limitations on human working memory. The limited nature of human working memory has been taken as
a  kind  of  explanatory  primitive  in  understanding  cognitive  processes.  However,  there  has  been 
surprisingly  little  effort  directed  at  understanding  why  it  is  limited,  and  how  these  limitations 


274 Meliora Hall
University  of  Rochester 
Rochester,  New  York  14627 



play  themselves  out  in  normal  behavior.  The  'active  vision'  approach  in  robotics  forces  a  new 
consideration  of  the  computational  role  of  short  term  memory.  It  seems  likely  at  this  point  that 
there  is  a  real  advantage  to  be  gained  by  such  a  system  in  terms  of  simplifying  the  underlying 
cortical  decision  making  processes  and  minimizing  the  need  for  a  central  executor.  A  better 
understanding  of  how  the  system  works  as  a  whole  should  provide  better  guidance  in  how  to 
approach  the  underlying  neural  organization.  We  have  prepared  a  manuscript  on  this  work 
which  will  shortly  be  submitted  to  Nature.  The  work  was  presented  at  the  Spring  meeting  of 
ARVO,  and  at  the  European  Conference  on  Visual  Perception. 
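The deictic strategy described above can be illustrated with a small sketch. This is not the model used in the project; it is a minimal illustration assuming that each block has two properties (color and position), that a fixation on the model can load at most `memory_capacity` properties, and that nothing is carried over between blocks. The names `Block`, `copy_pattern`, and `memory_capacity` are hypothetical.

```python
from dataclasses import dataclass

# Properties needed in order: color to pick a block up, position to put it down.
PROPERTIES = ("color", "position")


@dataclass(frozen=True)
class Block:
    color: str
    position: tuple


def copy_pattern(model, memory_capacity=1):
    """Copy `model` block by block and count fixations on the model.

    `memory_capacity` (an assumed parameter) is the number of block
    properties that one fixation on the model can load into memory.
    """
    if memory_capacity < 1:
        raise ValueError("memory_capacity must be at least 1")
    workspace = []
    model_fixations = 0
    for block in model:
        remembered = {}  # memory is cleared for every new block
        for i, prop in enumerate(PROPERTIES):
            if prop not in remembered:
                # The property is not in memory: look back at the model.
                model_fixations += 1
                for p in PROPERTIES[i:i + memory_capacity]:
                    remembered[p] = getattr(block, p)
        workspace.append(Block(**remembered))
    return workspace, model_fixations
```

With `memory_capacity=1` the sketch fixates the model twice per block (once for color, once for position), which corresponds to the minimal-memory strategy. With `memory_capacity=2` it needs only one fixation per block, at the cost of holding more in memory.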

Analysis of task sub-structure. As well as examining performance at the overall level of
strategies used, we have also analyzed performance at the level of the individual actions.
Analysis of the fixation patterns, for example, has suggested that fixation has a bookkeeping
role, keeping track of where the subject is in performing the task. We are continuing this kind of
analysis. Many aspects of the data remain to be analyzed, concerning the programming of the
eye and hand movements, and the timing of the cognitive operations involved in task
performance. We will use these data to elaborate our current computational model of the task.

Experiments involving Saccade-Contingent Display Updating. One important general class of
experiments involves changes in the visual display during a saccade. In the current grant period
we have written programs to do this in the experimental setup where the head is fixed, the
eye is monitored by the DPI eyetracker, and block movements are made using the mouse. Using
these newly developed programs, there are three major classes of questions we will investigate:
1. What is the nature of the visual information retained from previous views? 2. What are the
reference frames for programming the various movements in the task? 3. What is the nature of
the sub-components of the task?
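The logic of a saccade-contingent display change can be sketched as follows. This is not the laboratory's program and does not use any eyetracker interface; it is a minimal sketch assuming one-dimensional eye position samples in degrees, a fixed sampling rate, and a simple velocity threshold for saccade detection. The function names and the 30 deg/s default threshold are assumptions made for illustration.

```python
def saccade_onsets(positions_deg, sample_rate_hz, threshold_deg_per_s=30.0):
    """Return the sample indices at which eye velocity first exceeds the threshold."""
    onsets = []
    in_saccade = False
    for i in range(1, len(positions_deg)):
        # Velocity estimated from the difference between successive samples.
        velocity = abs(positions_deg[i] - positions_deg[i - 1]) * sample_rate_hz
        moving = velocity > threshold_deg_per_s
        if moving and not in_saccade:
            onsets.append(i)
        in_saccade = moving
    return onsets


def displays_shown(positions_deg, sample_rate_hz, before, after):
    """Return the display shown at each sample: `before` until the first
    detected saccade onset, `after` from that sample onward."""
    onsets = saccade_onsets(positions_deg, sample_rate_hz)
    switch_at = onsets[0] if onsets else len(positions_deg)
    return [before if i < switch_at else after
            for i in range(len(positions_deg))]
```

Because the change is triggered at saccade onset, the new display is in place while the eye is still in flight, so the change itself is not seen as a visible transient. A real implementation would also have to account for the tracker and display latencies, which this sketch ignores.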

Copying  Task  in  a  Natural  Environment 

In  addition,  we  have  begun  investigation  of  performance  using  real  blocks  and  hand 
movements,  with  the  subjects'  head  free  to  move,  using  an  ASL  head-free  eye  and  head  tracker, 
and  a  magnetic  hand  coil.  (This  equipment  was  bought  on  an  NIH  Resource  Development  Grant. 
Development  of  the  laboratory  was  undertaken  by  Pelz  and  an  undergraduate  lab  assistant 
supported  by  the  grant.  A  new  Mac  to  run  this  system  will  also  be  purchased  using  AFOSR 
funds,  in  order  to  improve  the  temporal  sampling  rate.)  This  has  provided  important  validation 
of  our  task  in  more  natural  conditions  in  addition  to  revealing  a  number  of  new  findings. 
Subjects  in  the  natural  task  perform  in  the  same  stereotypical  way  as  in  the  Mac  task, 
characterized  by  frequent  eye  movements  to  the  model  pattern.  Thus  the  use  of  short  term 
memory  is  extremely  limited,  even  to  the  extent  that  properties  of  a  single  block  are  acquired 
separately.  Even  when  the  cost  of  references  to  the  environment  was  increased  by  moving  the 
model and workspace 70 deg apart (requiring large head movements), subjects still made
frequent  reference  to  the  model.  Conversely,  other  experiments  revealed  that  performance 
declined  precipitously  when  frequent  access  to  the  model  during  task  performance  was 
prohibited. 

Subjects  differ  in  the  frequency  with  which  they  make  return  saccades  to  the  model  but  are 
remarkably  similar  in  other  aspects  of  performance,  such  as  fixation  duration,  time  for  pick  up 
and for put down. In addition, the time spent in the model area does not change significantly in
the  course  of  the  trial,  indicating  that  very  little  learning  of  the  model  takes  place. 

Preliminary observations reveal some notable features of the eye and head coordination:
1) Head movement frequently leads the gaze change. 2) The fraction of gaze shift due to
head movement varied from 20% for short, vertical movements, to nearly 100% for large
horizontal movements. It was in general dependent on the sub-task, and was larger for
horizontal than vertical gaze changes. 3) Head and gaze paths are commonly observed to
diverge. While the head performs a single movement from the resource area to the workspace
after picking up a block, the gaze moves first to the model area, then down to the workspace.
This suggests that the eye and head are not driven by a single, central motor command.
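The two measures reported in observations 1) and 2) can be written down explicitly. The following is a minimal sketch, assuming head and gaze directions are measured in degrees along the same axis and in the same external reference frame, so that the gaze shift is the sum of the head movement and the eye-in-head movement. The function names and argument conventions are hypothetical and are not taken from the laboratory's analysis software.

```python
def head_contribution(head_start_deg, head_end_deg, gaze_start_deg, gaze_end_deg):
    """Fraction of a gaze shift that is carried by the head movement."""
    gaze_shift = gaze_end_deg - gaze_start_deg
    if gaze_shift == 0:
        raise ValueError("gaze did not move; the fraction is undefined")
    return (head_end_deg - head_start_deg) / gaze_shift


def head_leads_gaze(head_onset_ms, gaze_onset_ms):
    """True when the head starts to move before the gaze change begins."""
    return head_onset_ms < gaze_onset_ms
```

For example, a 70 deg horizontal gaze shift accompanied by a 14 deg head movement gives a head contribution of 0.2, the low end of the range reported above, and a head movement equal to the full gaze shift gives 1.0.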

Other  Studies. 

Keith  Kam  has  begun  an  investigation  comparing  the  effect  of  external  reference  frames  on 
perception  and  reaching.  Perception  and  action  have  often  been  thought  to  involve  different 
neural  channels.  Kam  is  examining  the  hypothesis  that  the  differences  are  more  simply 
accounted  for  in  terms  of  the  reference  frame  used  for  the  task  in  question.  These  experiments 
will form the basis for his dissertation, which he plans to complete by the end of the year.

Jeff  Pelz  and  I  have  also  continued  work  on  our  investigation  of  the  role  of  the  visual  scene 
in  visual  stability  and  direction  constancy,  using  an  afterimage  technique  (see  previous  reports). 
We have made some further observations on the nature of the eye movements which can be
suppressed when a normal scene is present, and have prepared a manuscript which will be
submitted to Vision Research very soon.

Joel  Lachter  has  continued  experiments  on  how  much  visual  information  gets  processed 
outside  the  focus  of  attention.  He  is  using  a  novel  technique  introduced  by  Rock  to  look  at  this 
question.  Data  collection  is  now  complete  and  he  is  writing  his  thesis.  His  experiments  suggest 
that visual processing outside the focus of attention is minimal. Lachter and I have also worked
on  a  manuscript  on  the  role  of  attention  in  integrating  information  across  saccades.  We  plan  to 
finish  this  in  a  month  or  two. 

RELEVANT  PUBLICATIONS 

Hayhoe,  M.M.,  Lachter,  J.  &  Moeller,  P.  (1992)  Spatial  memory  and  integration  across  saccadic 
eye  movements.  In  K.  Rayner  (Ed.),  Eye  Movements  &  Visual  Cognition.  Springer-Verlag.  pp. 
130-145. 

Kam, K., Moeller, P., & Hayhoe, M. (1992) Precision of the eye position signal. In Van
Rensbergen, J. & d'Ydewalle, G. (Eds.), Studies in Visual Attention. North Holland. pp. 71-82.

Ballard,  D.,  Hayhoe,  M.,  &  Whitehead,  S.  (1992)  Hand-Eye  coordination  during  sequential 
tasks. Phil Trans Roy Soc Lond B., 331-339.

Hayhoe, M., Ballard, D., & Whitehead, S. (1993) Memory use during hand-eye coordination.
Proceedings of the Cognitive Science Society.

MANUSCRIPTS  IN  PREPARATION 

Ballard, D., Hayhoe, M., & Whitehead, S. Memory use during hand-eye coordination. (in preparation).

Moeller, P., Hayhoe, M., Ballard, D., & Albano, J. Saccades to remembered visual targets and the
perception of spatial position. (in preparation).

Lachter,  J.,  Hayhoe,  M.  &  Feldman,  J.  Capacity  limits  in  the  integration  of  information  across 
saccades.  (in  preparation). 

Pelz, J. & Hayhoe, M. Influence of the visual scene in space constancy. (in preparation).

Kam, K., Hayhoe, M., & Moeller, P. Effect of intervening eye movements on saccades to remembered visual
targets. (in preparation).


PRESENTATIONS  AT  SCIENTIFIC  MEETINGS 


Ballard,  D.,  Hayhoe,  M.,  Li,  F.,  and  Whitehead,  S.  (1992)  Hand-eye  coordination  during 
complex  tasks.  Presented  at  the  annual  meeting  of  the  Association  for  Research  in  Vision  and 
Ophthalmology,  Sarasota,  Florida. 

Hayhoe, M., Ballard, D., Li, F., and Whitehead, S. (1992) Hand-eye coordination during
complex  tasks.  Presented  at  the  European  Conference  on  Visual  Perception. 


Sincerely yours,


Mary  M.  Hayhoe 


